Selective cloning of homoduplex nucleic acids

ABSTRACT

The subject invention provides for a method of selectively cloning homoduplex nucleic acid molecules, in particular, by using a strain of host cells that contains a conditionally expressed and/or conditionally active mismatch-recognizing enzyme, e.g., a temperature sensitive variant of the gene encoding the endonuclease VII from phage T4. Using this host strain, the invention features a novel cloning method that selects for PCR products that are devoid of PCR-generated mutations.

RELATED APPLICATION

This application is a continuation of U.S. application Ser. No. 10/246,838, filed Sep. 19, 2002, which is a continuation-in-part of U.S. application Ser. No. 10/180,174, filed Jun. 26, 2002, the entire contents of which are incorporated herein by reference.

BACKGROUND

Maintaining the fidelity of a nucleic acid sequence is of paramount importance in many aspects of molecular biology. During the cloning of a gene sequence, for example, the cloning processes themselves can introduce artifacts into the sequence being cloned, resulting in the isolation of an artificial variant sequence, rather than the true original sequence.

For instance, PCR amplification is known to create mutations at a much higher rate than in vivo propagation (replication) of DNA. The mutation rate will vary with PCR conditions, choice of enzyme, number of replication cycles, etc. After 30 rounds of PCR amplification, it is estimated that between 2% and 10% of the molecules (depending on their size) contain at least one mutation. Thus, when PCR amplified DNA is cloned into a plasmid vector, 2-10% of the cloned inserts will contain mutations. Depending on the downstream application, it is often necessary to sequence the amplified DNA prior to further experimentation. However, since many cloning procedures involve some amplification steps, sequencing can only show if additional variations have been introduced, and cannot demonstrate that the current sequence is true to the original.
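The dependence of the mutated fraction on fragment size can be illustrated with a common first-order approximation, in which each base has an independent chance of being miscopied at each duplication. The following is a minimal sketch only: the function name, the per-base error rate (1e-6), and the number of effective duplications (20) are illustrative assumptions and are not values taken from this disclosure.

```python
def fraction_mutated(error_rate, length_bp, duplications):
    """Approximate fraction of amplified molecules carrying >= 1 mutation.

    error_rate   : assumed errors per base per duplication (hypothetical value)
    length_bp    : length of the amplified fragment in base pairs
    duplications : assumed number of effective template duplications
    """
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    # Probability that every base is copied correctly at every duplication.
    p_error_free = (1.0 - error_rate) ** (length_bp * duplications)
    return 1.0 - p_error_free


# Illustrative comparison of a short and a long fragment (assumed parameters).
for length in (1000, 5000):
    print(length, round(fraction_mutated(1e-6, length, 20), 4))
```

Under these assumed parameters, longer fragments accumulate proportionally more errors, which is consistent with the size dependence noted above.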

In another instance, when assaying sequences within samples for the presence of alterations that indicate the possibility of genetic disease, it becomes critical to know that one is working with the actual sequence originally sampled, rather than an amplification-induced artifact.

In studying variations between the genomes of related organisms and populations, it is also critical to know that one is studying actual sequence differences instead of artifacts. Some assays of genetic diversity within and between human populations, for example, test for the presence of single nucleotide polymorphisms (SNPs). Variations in sequences that are actually due to amplification can produce spurious results.

There is therefore a need in the art for methods that select for PCR amplification products and other nucleic acids that do not contain PCR-induced mutations or other alterations.

SUMMARY OF THE INVENTION

The invention provides for a method of selectively cloning homoduplex nucleic acid molecules, that is, a method of cloning nucleic acid molecules such that the nucleic acid molecules cloned by the method contain a reduced number of heteroduplexes, mismatches, and changes over the parent sequence, relative to nucleic acid molecules not cloned by the method described herein.

For example, in one embodiment, a host strain is provided that contains a temperature sensitive variant of the gene encoding the endonuclease VII from phage T4, e.g., the nucleic acid sequence of SEQ ID NO:1, encoding the amino acid sequence of SEQ ID NO:2. The present invention also provides a method for the selective cloning of amplified nucleic acids with a reduced number of PCR induced mutations, relative to amplified nucleic acids cloned by other methods.

In one embodiment, the invention provides for a host cell for selectively cloning homoduplex DNA molecules. The host cell is competent, and contains a heterologous gene encoding a conditionally expressed and/or conditionally active resolvase, and the host cell can be maintained under conditions where the resolvase is not expressed and/or is repressed and propagated in inactive form. The host cell can be eukaryotic or prokaryotic. The resolvase can be a eukaryotic resolvase, or a bacteriophage resolvase, e.g., bacteriophage T4 endonuclease VII, e.g., a temperature sensitive mutant of bacteriophage T4 endonuclease VII, e.g., the temperature sensitive mutant of SEQ ID NO:2. The resolvase can be an enzyme the expression of which is completely repressed under certain conditions.

The invention also features a method of selectively cloning homoduplex nucleic acid molecules, where the method includes providing one or more host cells containing a heterologous gene encoding a conditionally expressed and/or conditionally active resolvase, transforming the host cells with cloned nucleic acid molecules, and then maintaining the transformed host cells so that the resolvase is expressed and active, and cleaves the cloned heteroduplex molecules, leaving the homoduplex molecules.

The method can further include denaturing and renaturing the nucleic acid molecules before ligation. The method can also include, after action of the resolvase, infecting the cells with helper phage to rescue the homoduplex cloned nucleic acid molecules. The method can also include maintaining the host cells under conditions that prohibit the expression of the resolvase.

In another aspect, the invention features a method of selectively cloning amplified nucleic acid molecules possessing a reduced number of mutations, where the method includes providing one or more host cells containing a heterologous gene encoding a conditionally expressed and/or conditionally active resolvase, transforming the host cells with cloned amplified nucleic acid molecules, and then maintaining the transformed host cells under conditions suitable for expression of resolvase, so that the cloned heteroduplex molecules are cleaved by the resolvase, leaving the homoduplex cloned molecules.

The method can include the further step of denaturing and renaturing the nucleic acid molecules before ligation. The method can include the further step of, after action of the resolvase, infecting the cells with helper phage to rescue the homoduplex ligated nucleic acid molecules. The method can also include the further step of maintaining the host cells under conditions that prohibit the expression of the resolvase. The host cell can be eukaryotic or prokaryotic. The cloning vector can be a phagemid cloning vector or a cosmid vector. The conditionally expressed resolvase can be a eukaryotic resolvase or a bacteriophage resolvase, e.g., bacteriophage T4 endonuclease VII, e.g., a temperature sensitive mutant of bacteriophage T4 endonuclease VII, e.g., the protein of SEQ ID NO:2. The resolvase can be an enzyme the expression of which is completely repressed under certain conditions.

The pool of cloned nucleic acid molecules can contain both hetero- and homoduplex nucleic acids, and the heteroduplex molecules are cleaved by the resolvase in the methods.

In an additional aspect, the invention features a kit comprising a competent host cell containing a heterologous gene encoding a conditionally expressed resolvase, and packaging materials for same. The host cell can be maintained under conditions under which the resolvase is expressed. The kit can also contain a host cell containing an expression vector encoding such a resolvase, for example the temperature sensitive mutant of bacteriophage T4 endonuclease VII of SEQ ID NO:2, or the host cell can contain an isolated DNA (e.g., SEQ ID NO:1) encoding the temperature sensitive mutant of bacteriophage T4 endonuclease VII of SEQ ID NO:2.

In any of the above kits, the resolvase can be an enzyme the expression of which is completely repressed under certain conditions. The kits can also contain other reagents, such as PCR reagents, and packaging materials.

The invention also features an isolated protein of a temperature sensitive mutant of bacteriophage T4 endonuclease VII, where the isolated protein has the amino acid sequence of SEQ ID NO:2, and an isolated nucleic acid encoding such a mutant, where the isolated nucleic acid has the sequence of SEQ ID NO:1.

By the term “heteroduplex” is meant a structure formed between two annealed nucleic acid strands (e.g., the annealed strands of test and reference nucleic acids, or between template and product in an amplification reaction) in which one or more nucleotides in the first strand are unable to appropriately base pair with those in the second opposing, complementary strand because of one or more mismatches.

Most mismatch-recognizing enzymes and proteins are involved in nucleic acid repair, and so are highly specific. Thus, a duplex possessing just a single mismatch is likely to be recognized by such enzymes and proteins. For heteroduplexes containing high numbers of mismatches, or multiple mismatches clustered closely together, it seems likely that most mismatch recognizing enzymes and proteins would require a spacing of at least three nucleotides between the mismatches in order to resolve each mismatch, in order for the enzyme or protein to be properly situated on the nucleic acid and to act on the mismatch. However, it is not a requirement of the invention that the mismatch recognizing enzyme or protein be able to resolve every mismatch in a heteroduplex nucleic acid. Rather, the enzyme or protein needs only to either (1) bind to the heteroduplex nucleic acid, so that the enzyme/protein-heteroduplex complex can be removed, or (2) cleave in the vicinity of at least one mismatch, so that the resulting products can be selected on the basis of size, e.g., so that products of an expected size are selected, versus smaller (i.e., cleaved) products.

Examples of different types of heteroduplexes include those that exhibit differences over one or several nucleotides, and insertion or deletion mutations, each of which is disclosed in Bhattacharyya and Lilley, Nucl. Acids. Res. 17: 6821 (1989).

The term “complementary,” as used herein, means that two nucleic acids, e.g., DNA or RNA, contain a series of consecutive nucleotides which are capable of forming matched Watson-Crick base pairs to produce a region of double-strandedness (except in the region of mismatch). Thus, adenine in one strand of DNA or RNA pairs with thymine in an opposing complementary DNA strand or with uracil in an opposing complementary RNA strand, and guanine in one strand pairs with cytosine in an opposing strand. The region of pairing is referred to as a “duplex.” A duplex may be either a homoduplex or a heteroduplex. In a preferred embodiment, nucleic acid is subjected to amplification, e.g., by PCR, during which it is heat denatured and reannealed to generate homoduplexes and heteroduplexes. In general, the methods described herein will work under those conditions that allow the hybridization of complementary strands of nucleic acid.

“Mismatch”, as used herein, refers to a duplex in which less than all of the nucleotides on one strand are perfectly matched to the other strand (e.g., where nucleotide pairing other than adenosine-thymine or guanine-cytosine occurs, e.g., nucleotide pairing such as adenosine-cytosine, adenosine-guanine, adenosine-adenosine, thymine-cytosine, thymine-guanine, thymine-thymine, guanine-guanine, or cytosine-cytosine occurs), where a deletion or insertion of one or more DNA nucleotides on one strand as compared to the other complementary strand occurs (e.g., a deletion of 1, 2, 5, 10, 15, or more nucleotides or an insertion of 1, 2, 5, 10, 15, or more nucleotides occurs), or where other mismatches between the two strands of the duplex occur. DNA mismatches may arise from nucleic acid replication errors, mutagenesis, deamination of 5-methylcytosine, formation of thymidine dimers, nucleic acid recombination, etc.
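The distinction between a homoduplex and a heteroduplex, as defined above, can be illustrated computationally. The following is a minimal sketch, assuming two DNA strands of equal length, each written 5′ to 3′; it detects only substitution-type mismatches and does not handle insertions or deletions. The function names are hypothetical and are used for illustration only.

```python
# Watson-Crick pairs for DNA, as described in the definition of "complementary".
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def find_mismatches(top, bottom):
    """Return 0-based positions (on the top strand) that are not Watson-Crick paired.

    Both strands are given 5'->3'; the bottom strand is reversed so that
    position i of the top strand faces its antiparallel partner.
    """
    if len(top) != len(bottom):
        raise ValueError("this sketch only handles strands of equal length")
    facing = bottom.upper()[::-1]
    return [i for i, pair in enumerate(zip(top.upper(), facing))
            if pair not in WATSON_CRICK]


def is_homoduplex(top, bottom):
    """A duplex with no mismatched positions is treated as a homoduplex."""
    return not find_mismatches(top, bottom)


print(is_homoduplex("ATGC", "GCAT"))    # perfectly matched duplex
print(find_mismatches("ATGC", "GCGT"))  # thymine-guanine mismatch at position 1
```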

A “mutation”, as used herein, refers to a nucleotide sequence change (i.e., a nucleotide substitution, deletion, or insertion) in an isolated nucleic acid relative to a reference nucleic acid. In one embodiment, the reference nucleic acid is a template-specific nucleic acid used in an amplification reaction.

The term “expression vector” as used herein refers to a recombinant DNA molecule containing a desired coding sequence and appropriate nucleic acid sequences necessary for the expression of the operably linked coding sequence in a particular host organism. Nucleic acid sequences necessary for expression in prokaryotes usually include a promoter, an operator (optional), and a ribosome-binding site, often along with other sequences. Eukaryotic cells are known to utilize promoters, enhancers, and termination and polyadenylation signals.

The terms “transform” and “transfect” as used herein refer to the introduction of foreign DNA into prokaryotic or eukaryotic cells. Transformation of prokaryotic cells may be accomplished by a variety of means known to the art including the treatment of host cells with various salt solutions or nonionic compounds (e.g., CaCl₂) to render the cells competent, electroporation treatment, etc. Transfection of eukaryotic cells may be accomplished by a variety of means known to the art including calcium phosphate-DNA co-precipitation, DEAE-dextran-mediated transfection, polybrene-mediated transfection, electroporation, microinjection, liposome fusion, lipofection, protoplast fusion, retroviral infection, and biolistics.

A “mismatch recognizing” enzyme or protein is an enzyme or protein that recognizes a heteroduplex nucleic acid molecule and binds to it and/or cleaves it. A number of different enzymes qualify as mismatch recognizing enzymes and proteins, including DNA repair enzymes, recombination proteins, and resolvases.

Resolvases are enzymes that process recombinational intermediates. They have the secondary effect of acting on mismatched DNA, which in some respects resembles a recombinational intermediate. “Resolvase”, as used herein, refers to an enzyme that cleaves a nucleic acid as the result of the presence of a distortion in a duplex, e.g., a bend, kink or other DNA deviation, e.g., a DNA mismatch, e.g., a single base pair substitution, insertion or deletion. Resolvases are found in many different organisms, including bacteria, phage, yeast, and mammals, e.g., humans. The enzyme exerts its effect by mismatch-dependent cleavage, i.e., cleavage of at least one DNA strand, close to the site of DNA distortion, e.g., a DNA mismatch.

Examples of resolvases include, without limitation, T4 endonuclease VII, Saccharomyces cerevisiae Endo X1, Endo X2, or Endo X3 (Jensch et al., EMBO J. 8:4325, 1989), T7 endonuclease I, E. coli MutY (Wu et al., Proc. Natl. Acad. Sci. USA 89:8779-8783, 1992), mammalian thymine glycosylase (Wiebauer et al., Proc. Natl. Acad. Sci. USA 87:5842-5845, 1990), topoisomerase I from human thymus (Yeh et al., J. Biol. Chem. 266:6480-6484, 1991; Yeh et al., J. Biol. Chem. 269:15498-15504, 1994), deoxyinosine 3′ endonuclease (Yao and Kow, J. Biol. Chem. 269:31390-31396, 1994) and Mus81 (Boddy et al., Cell 107:537-48, 2001; Chen et al., Mol. Cell. 8:1117-27, 2001). In one embodiment, the resolvase is isolated from a bacteriophage, e.g., bacteriophage T3, T4 or T7. In another embodiment, the resolvase is a mutated form (nucleic acid sequence, SEQ ID NO:1; amino acid sequence, SEQ ID NO:2; FIG. 1) of the wild type endonuclease VII of phage T4 (GenBank Accession No.: X12629; nucleic acid sequence, SEQ ID NO:3; amino acid sequence, SEQ ID NO:4; FIG. 2).

“Mismatch-dependent cleavage”, as used herein, refers to a characteristic of an enzyme such as a resolvase. An enzyme has a mismatch-dependent cleavage activity if it cleaves at a mismatch at a significantly higher rate than it would cleave a corresponding perfectly matched sequence. In preferred embodiments, an enzyme with a mismatch-dependent cleavage activity is at least about 5%, 15%, 25%, 50%, 75% or 100% more efficient at cleaving at a mismatch than at a corresponding perfectly matched sequence.

As used herein, “conditionally expressed” and “conditionally active” refer to the expression of a mismatch recognizing enzyme or protein, which is only present and functional under certain conditions.

In one embodiment, the enzyme or protein is “conditionally expressed”, that is, the protein is only expressed under certain conditions, i.e., permissive conditions. Under non-permissive conditions, the enzyme or protein is not produced, and is not present.

In another embodiment, the enzyme or protein is “conditionally active”, that is, the gene encoding the enzyme or protein is always present, but is only active under certain conditions. In one such embodiment, the enzyme has a temperature sensitive mutation that renders the protein non-functional (i.e., inactive) at warm temperatures (e.g., 37-42° C.), and functional (i.e., active) at colder temperatures. In one such embodiment, the gene encoding the mutant protein or enzyme has a nonsense mutation that results in the translation of a truncated protein. In a preferred embodiment, “conditionally expressed” or “conditionally active” refer to the expression of a temperature sensitive mutant T4 endonuclease VII protein that is active and functional but also lethal to the host cells at 25° C., for example, the temperature sensitive mutant T4 endonuclease VII protein, SEQ ID NO:2, encoded by SEQ ID NO:1.

A “host cell” is a cell which has been transformed or transfected, or is capable of transformation or transfection by a heterologous polynucleotide sequence. Host cells can be prokaryotic or eukaryotic, mammalian, plant, or insect, and can exist as single cells, or as a collection, e.g., as a culture, or in a tissue culture, or in a tissue or an organism. In a preferred embodiment, prokaryotic host cells are bacteria that harbor the F′ episome that enables f1 phage rescue as described herein. Host cells can also be derived from normal or diseased tissue from a multicellular organism, e.g., a mammal. Host cell, as used herein, is intended to include not only the original cell which was transformed with a nucleic acid, but also descendants of such a cell, which still contain the nucleic acid.

In one embodiment, the “host cell” is a cell which contains a mismatch recognizing enzyme or protein, where the activity of the enzyme or protein can be controlled, i.e., be regulated by the one practicing the invention. The mismatch recognizing enzyme or protein can be native to the cell, or can be heterologous, that is, introduced into the cell, e.g., transfected or transformed into the cell. The enzyme or protein can also be one which was originally native to the host cell, but has been altered so that it is now capable of being controlled.

“Heterologous” nucleic acid refers to nucleic acid not naturally located in the cell, or in a chromosomal site of the cell. Preferably, the heterologous nucleic acid includes a nucleic acid foreign to the cell.

As used herein, “a mixture of DNA molecules” or “mixture of nucleic acid molecules” refers to DNA or other nucleic acid molecules which, when aligned, may vary in sequence at one or more positions or at no positions. In a preferred embodiment, the terms refer to DNA or other nucleic acid that was amplified, e.g., by the polymerase chain reaction.

As used herein, the term “amplified”, when applied to a nucleic acid sequence, refers to a process whereby one or more copies of a particular nucleic acid sequence are generated from a template nucleic acid, preferably by the method of polymerase chain reaction (Mullis and Faloona, 1987, Methods Enzymol. 155:335). “Polymerase chain reaction” or “PCR” refers to an in vitro method for amplifying a specific nucleic acid template sequence. The PCR reaction involves a repetitive series of temperature cycles and is typically performed in a volume of 50-100 μl. The reaction mix comprises dNTPs (each of the four deoxynucleotides dATP, dCTP, dGTP, and dTTP), primers, buffers, thermostable DNA polymerase, and nucleic acid template. The PCR reaction comprises providing a set of polynucleotide primers wherein a first primer contains a sequence complementary to a region in one strand of the nucleic acid template sequence and primes the synthesis of a complementary DNA strand, and a second primer contains a sequence complementary to a region in a second strand of the target nucleic acid sequence and primes the synthesis of a complementary DNA strand, and amplifying the nucleic acid template sequence employing a nucleic acid polymerase as a template-dependent polymerizing agent under conditions which are permissive for PCR cycling steps of (i) annealing of primers required for amplification to a target nucleic acid sequence contained within the template sequence, and (ii) extending the primers wherein the nucleic acid polymerase synthesizes a primer extension product. “A set of polynucleotide primers” or “a set of PCR primers” can comprise two, three, four or more primers. Other methods of amplification include, but are not limited to, ligase chain reaction (LCR), nucleic acid sequence-based amplification (NASBA), or any other method known in the art.

As used herein, “nucleic acid polymerase” refers to an enzyme that catalyzes the polymerization of nucleotides. Generally, the enzyme will initiate synthesis at the 3′-end of the primer annealed to a nucleic acid template sequence, and will proceed in the 5′-direction along the template. “DNA polymerase” catalyzes the polymerization of deoxynucleotides. Known DNA polymerases include, for example, Pyrococcus furiosus (Pfu) DNA polymerase, E. coli DNA polymerase I, T7 DNA polymerase, Thermus thermophilus (Tth) DNA polymerase, Bacillus stearothermophilus DNA polymerase, Thermococcus litoralis (Tli) DNA polymerase (also referred to as Vent DNA polymerase), Thermotoga maritima (UlTma) DNA polymerase, Thermus aquaticus (Taq) DNA polymerase, and Pyrococcus GB-D (PGB-D) DNA polymerase. The polymerase activity of any of the above enzymes can be defined by means well known in the art. One unit of DNA polymerase activity, according to the subject invention, is defined as the amount of enzyme which catalyzes the incorporation of 10 nmoles of total dNTPs into polymeric form in 30 minutes at 72° C.

As used herein, “thermostable” refers to an enzyme (or protein) which is stable and active at temperatures as great as preferably between about 90-100° C. and more preferably between about 70-98° C., as compared to a non-thermostable form of an enzyme with a similar activity that is typically denatured at such elevated temperatures. For example, a representative thermostable nucleic acid polymerase isolated from Thermus aquaticus (Taq) is described in U.S. Pat. No. 4,889,818 and a method for using it in conventional PCR is described in Saiki et al. (1988, Science 239:487). Another representative thermostable nucleic acid polymerase isolated from P. furiosus (Pfu) is described in Lundberg et al. (1991, Gene 108:1-6). Additional representative temperature stable polymerases include, e.g., polymerases extracted from the thermophilic bacteria Thermus flavus, Thermus ruber, Thermus thermophilus, Bacillus stearothermophilus (which has a somewhat lower temperature optimum than the others listed), Thermus lacteus, Thermus rubens, Thermotoga maritima, or from thermophilic archaea Thermococcus litoralis, and Methanothermus fervidus.

As used herein, a “PCR ligation mix” or “amplification ligation mix” refers to the reaction mix after ligation has occurred in which an amplified nucleic acid fragment is ligated to a recombinant nucleic acid vector that is capable of autonomous replication within a host cell. In a preferred embodiment, the recombinant nucleic acid vector is a phagemid or a cosmid vector.

As used herein, “competent” cells refers to host cells that are primed for the uptake of nucleic acids. Competent cells are treated to make their cell membranes more permeable in order to facilitate the entry of heterologous nucleic acids.

The term “recombinant DNA vector” or “recombinant nucleic acid vector” as used herein refers to DNA or other nucleic acid sequences containing a desired coding sequence and appropriate sequences necessary for the expression of the operably linked coding sequence in a particular host organism. Sequences necessary for expression in prokaryotes include a promoter, optionally an operator sequence, a ribosome-binding site and possibly other sequences. Eukaryotic cells are known to utilize promoters, polyadenylation signals and enhancers.

As used herein, “helper phage” refers to a normal wild-type version of the phage, which typically grows along with a specialized phage such as a phagemid (Bluescript, pUC and the like) and supplies whatever functions are necessary for generating phage particles. In one embodiment, the “helper phage” is M13KO7, an M13 phage that is able to replicate in E. coli in the absence of phagemid DNA. In the presence of a phagemid bearing a wild-type M13 or f1 origin, single-stranded phagemid is packaged preferentially and secreted into the culture medium.

As used herein, “rescue” refers to the recovery of homoduplex nucleic acid molecules that were not degraded according to the invention, by one or more mismatch recognizing enzymes or proteins.

As used herein, “reducing the amount of nucleic acid heteroduplexes” or “reducing the number of nucleic acid heteroduplexes” refers to a decrease in the number of nucleic acid heteroduplexes in a mixture of nucleic acid molecules as a result of the method of the invention. In one embodiment, the amount and/or number of nucleic acid heteroduplexes are reduced by enzymatic cleavage by T4 endonuclease VII in a host cell. In a preferred embodiment, the decrease in the amount or number of nucleic acid heteroduplexes in a mixture of nucleic acid molecules treated according to the invention is at least 75%, preferably 90%, more preferably 99% and most preferably 100% (i.e., no heteroduplexes remain) as compared to the amount or number of nucleic acid heteroduplexes in a mixture of nucleic acid molecules not treated according to the invention, e.g., not enzymatically cleaved by T4 endonuclease VII in a host cell.
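The percentage decrease referred to above can be computed by comparing heteroduplex counts in a treated mixture against an untreated mixture. The following is a minimal sketch; the function names and the example counts are hypothetical and are not experimental results from this disclosure.

```python
def percent_reduction(untreated, treated):
    """Percentage decrease in heteroduplexes relative to the untreated mixture."""
    if untreated <= 0:
        raise ValueError("untreated count must be positive")
    return 100.0 * (untreated - treated) / untreated


def meets_threshold(untreated, treated, threshold_pct=75.0):
    """True if the reduction is at least the given percentage (default 75%)."""
    return percent_reduction(untreated, treated) >= threshold_pct


# Hypothetical counts: 200 heteroduplexes untreated, 20 remaining after treatment.
print(percent_reduction(200, 20))
print(meets_threshold(200, 20, threshold_pct=90.0))
```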

As used herein, “maintaining” the host cells refers to those experimental conditions of IPTG induction, competence to nucleic acid transformation and non-permissive or non-active conditions, i.e., conditions where the mismatch recognizing enzyme or protein is not active. In one embodiment, maintenance of the cells is at warm temperatures, e.g., 37-42° C., i.e., high temperature that allows cell viability, expression of non-functional T4 endonuclease VII, digestion of heteroduplex ligated DNA and rescue of homoduplex DNA with f1 helper phage before the onset of host cell death.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram showing the nucleic acid (SEQ ID NO:1) and amino acid (SEQ ID NO:2) sequences of the temperature-sensitive mutant T4 endonuclease VII.

FIG. 2 is a diagram showing the nucleic acid (SEQ ID NO:3) and amino acid (SEQ ID NO:4) sequences of the wild-type T4 endonuclease VII.

FIG. 3 is a diagram showing the alignment of the temperature-sensitive mutant T4 endonuclease VII nucleic acid sequence (SEQ ID NO:1) and the wild type T4 endonuclease VII nucleic acid sequence (SEQ ID NO:3), showing that the mutant has a single base deletion at position 461 of the wild type sequence.

FIG. 4 is a diagram showing the alignment of the temperature-sensitive mutant T4 endonuclease VII amino acid sequence (SEQ ID NO:2) and the wild type T4 endonuclease VII amino acid sequence (SEQ ID NO:4), showing that the mutant protein is missing the last two amino acids of the wild type protein, and that the last two amino acids of the mutant protein are substituted relative to the corresponding two amino acids in the wild-type protein.

FIG. 5 is a diagram showing the nucleic acid (SEQ ID NO:5) and amino acid (SEQ ID NO:6) sequences of a wild-type Green Fluorescent Protein.

FIG. 6 is a diagram showing the nucleic acid (SEQ ID NO:7) and amino acid (SEQ ID NO:8) sequences of a mutant Green Fluorescent Protein comprising a premature stop codon.

DETAILED DESCRIPTION

The invention provides for a method of selecting homoduplex nucleic acid molecules out of a mixture of homoduplex and heteroduplex nucleic acid molecules, that is, of selecting out those nucleic acid molecules that contain a reduced number of heteroduplexes, mismatches, and changes relative to the parent sequence, compared to nucleic acid molecules not selected by the methods described herein. The method can be done in vivo, that is, within a host cell, by cloning the nucleic acid molecules to be selected into the host cell, where a mismatch recognizing enzyme or protein acts on the heteroduplex molecules, e.g., by cleaving them at or near the site of the mismatch.

In one embodiment, the subject invention provides for a host strain that contains a temperature sensitive variant of the gene encoding the endonuclease VII from phage T4. Using this host strain, the invention features a novel cloning method that selects for nucleic acids, e.g., PCR products, that do not contain mutations, e.g., PCR generated mutations.

In one embodiment, the invention provides for a competent host cell which specifically cleaves a mismatched nucleic acid it contains. Such a cell contains a plasmid containing a heterologous, conditionally active resolvase, e.g., a T4 endonuclease VII enzyme which has been mutated to be conditionally active. In one embodiment, the enzyme has been mutated and is temperature-sensitive, e.g., is active when the cells are grown at permissive temperatures, and inactive when the cells are grown at non-permissive (e.g., elevated) temperatures. Alternatively, a true conditionally expressed resolvase can be used, that is, a resolvase the expression of which is completely repressed under certain conditions.

The normal T4 endonuclease VII enzyme cannot be expressed and stably maintained in E. coli because it is lethal to those cells in which it is produced. This enzyme is a resolvase, and cleaves double stranded DNA a few bases downstream of any perturbation in the DNA. It therefore cleaves nucleic acids in which the two complementary strands are not perfectly matched. E. coli possesses its own DNA repair enzymes, but T4 endonuclease VII, when cloned into E. coli, produces a double strand blunt cleavage at mismatches, and such cleavages cannot be repaired. Such mismatches arise due to replication errors. Therefore, when normal (i.e., unmutated) T4 endonuclease VII is cloned into E. coli and expressed, it cleaves the host cell's DNA immediately after it has been synthesized.

In vitro, T4 endonuclease VII cleaves mismatched DNA, but will also digest homoduplex DNA at a reduced level. In vivo, the enzyme's specificity is much higher.

In one embodiment described herein, a mutated T4 endonuclease VII gene has been introduced into E. coli cells. This mutated enzyme gene encodes an altered endonuclease enzyme that contains a temperature-sensitive mutation, so that the enzyme is active at permissive temperatures (e.g., 25° C.), but exhibits little or no activity when the host cells are grown at higher temperatures (e.g., 37° C.-42° C.). In the mutated enzyme shown in FIG. 1 (SEQ ID NO:2), the enzyme is actually expressed when the host cells are grown or maintained at higher temperatures, but the protein folds incorrectly, resulting in little or no activity at those temperatures.
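The way in which a single base deletion near the 3′ end of a coding sequence can both substitute the final amino acids and shorten the protein, as described for FIGS. 3 and 4, can be illustrated with a toy example. The following is a minimal sketch only: the 27-nucleotide sequence is invented for illustration and is not SEQ ID NO:1 or SEQ ID NO:3, and the function and variable names are hypothetical.

```python
# Build the standard genetic code from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def translate(dna):
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Toy wild-type sequence (invented): ATG GCT AAA GGT CAT TGG GTA AAA TAA
toy_wild_type = "ATGGCTAAAGGTCATTGGGTAAAATAA"

# Delete a single base (0-based index 12) to shift the reading frame.
toy_mutant = toy_wild_type[:12] + toy_wild_type[13:]

print(translate(toy_wild_type))  # full-length toy protein
print(translate(toy_mutant))     # two residues shorter, last two residues changed
```

In this toy case the frameshift changes the two residues following the deletion and then brings a stop codon into frame, so the mutant product is two amino acids shorter than the wild-type product.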

Once the gene encoding the mutated endogenous endonuclease has been transformed into one or more host cells, the cells are grown at the non-permissive temperature, and rendered competent via any methods known in the art. The cells are then briefly maintained at the permissive temperature for 0-3 hours to allow the cell(s) to produce active resolvase. The cells are then frozen and stored, or are used immediately.

Double-stranded nucleic acids (e.g., amplification products) from which one wishes to select perfectly matched products (e.g., homoduplexes) are then ligated into an appropriate plasmid. The host cells are then thawed (if necessary), and are transformed with the plasmids containing the double-stranded nucleic acid. The active resolvase then acts to degrade any nucleic acid containing a perturbation, e.g., a mismatch.

One can rescue those plasmids containing homoduplexes by use of an F1 helper phage. Such a phage excises plasmids, packages the DNA into a phage particle, and infects the phage-plasmid hybrid molecules (called phagemids) into another F strain of bacteria. Preferably, cells of this other F strain of bacteria are also in the mixture of competent cells, so that the transfer (i.e., the rescue) of the plasmids can be done immediately.

In vitro amplification of nucleic acids (e.g., by polymerase chain reaction) tends to be highly error-prone. If a mutation occurs in a very early cycle, it is possible that the final amplification products will include a high proportion of perfectly matched products, which are yet mutated relative to the original template nucleic acid. The possibility of such an occurrence can be reduced by denaturing and reannealing the final PCR products, to increase the number of mismatches available for action by the resolvase. The number of cloned products will perforce be reduced, but this method increases the chances of removing “false” homoduplexes.
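
The trade-off described above can be illustrated with a short calculation. The following Python sketch is offered for illustration only and is not part of the disclosed method. It assumes that strands pair at random during reannealing and that the resolvase removes every heteroduplex; the function name and the 25% example value are hypothetical.

```python
def reannealing_outcome(mutant_fraction: float) -> tuple[float, float]:
    """Estimate the effect of denaturing and reannealing a PCR product pool.

    mutant_fraction: fraction of duplexes that are "false" homoduplexes
    (both strands carry the same early-cycle mutation) before reannealing.

    Returns (recovered, mutant_share):
      recovered    - fraction of duplexes that are still homoduplex after
                     random re-pairing, and so escape the resolvase
      mutant_share - fraction of those recovered duplexes that are mutant

    Assumptions (not from the source): random strand pairing and complete
    cleavage of all heteroduplexes.
    """
    if not 0.0 <= mutant_fraction <= 1.0:
        raise ValueError("mutant_fraction must lie between 0 and 1")
    f = mutant_fraction
    mutant_homoduplex = f * f               # mutant strand pairs with mutant strand
    original_homoduplex = (1.0 - f) ** 2    # original strand pairs with original strand
    recovered = mutant_homoduplex + original_homoduplex
    return recovered, mutant_homoduplex / recovered


if __name__ == "__main__":
    recovered, mutant_share = reannealing_outcome(0.25)
    print(f"recovered duplexes: {recovered:.3f}")
    print(f"mutant share among recovered: {mutant_share:.3f}")
```

Under these assumptions, fewer clones are recovered, but the share of mutated homoduplexes among them falls below its share in the starting pool whenever the mutant is in the minority.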

The practice of the present invention will employ, unless otherwise indicated, conventional techniques of molecular biology, cell biology, microbiology and recombinant DNA techniques, which are within the skill of the art. Such techniques are explained fully in the literature. See, e.g., Sambrook, Fritsch & Maniatis, 1989, Molecular Cloning: A Laboratory Manual, Second Edition; Oligonucleotide Synthesis (M. J. Gait, ed., 1984); Nucleic Acid Hybridization (B. D. Hames & S. J. Higgins, eds., 1984); A Practical Guide to Molecular Cloning (B. Perbal, 1984); (Harlow, E. and Lane, D.) Using Antibodies: A Laboratory Manual (1999) Cold Spring Harbor Laboratory Press; and a series, Methods in Enzymology (Academic Press, Inc.); Short Protocols In Molecular Biology, (Ausubel et al., ed., 1995). All patents, patent applications, and publications mentioned herein, both supra and infra, are hereby incorporated by reference in their entirety.

The invention provides a method for the cloning of a PCR product that has a reduced number of PCR-induced mutations, relative to products cloned by other methods. The invention therefore provides for primers specific for a test DNA template, methods of primer synthesis, methods of DNA amplification and cloning into a phagemid cloning vector. The invention also features host cells that harbor an expression vector containing a temperature sensitive mutant of T4 endonuclease VII. The method therefore provides for methods of mutagenesis and screening procedures required for the identification and isolation of host cells containing a temperature sensitive mutant of T4 endonuclease VII. The invention also provides protocols for the generation of competent host cells, a denaturation and renaturation of the ligated DNA, transformation of the competent host cells with the mixture of DNA homoduplex and heteroduplex molecules, maintenance of the host cells at a temperature permitting the expression of functional T4 endonuclease VII and subsequent rescue by f1 helper phage.

Primers According to the Invention

The invention provides for oligonucleotide primers useful for amplifying DNA or RNA sequences.

Primer Design

Primers may be selected manually by analyzing the template sequence. Computer programs, however, are also available for selecting primers to generate an amplified product with a designed length, e.g., primer premier 5 (available at the website of the company Premierbiosoft) and primer3 (available at the website of the Whitehead Institute for Biomedical Research, Cambridge, Mass., U.S.A).

It is known in the art that primers that are about 20-25 bases long and with 50% G-C content will work well at an annealing temperature of about 52-58° C. These properties are preferred when designing primers for the subject invention. Longer primers, or primers with higher G-C contents, have annealing optimums at higher temperatures; similarly, shorter primers, or primers with lower G-C contents, have optimal annealing properties at lower temperatures. A convenient, simplified formula for obtaining a rough estimate of the melting temperature of a primer 17-25 bases long is as follows:

Melting temperature (Tm in ° C.) = 4 × (# of G + # of C) + 2 × (# of A + # of T)
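
By way of illustration only, the simplified formula above can be applied as in the following Python sketch. The function name and the example primer sequence are hypothetical and are not taken from the invention; the estimate is rough and is intended only for primers of about 17-25 bases.

```python
def rough_tm(primer: str) -> int:
    """Rough melting temperature (deg C) using the simplified formula:
    Tm = 4 x (number of G + number of C) + 2 x (number of A + number of T).
    """
    seq = primer.upper()
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters in primer: {sorted(unexpected)}")
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return 4 * gc + 2 * at


if __name__ == "__main__":
    # Hypothetical 20-base primer with 50% G-C content.
    example = "AGCTTGACCTGAAGGTCCAT"
    print(len(example), rough_tm(example))  # prints: 20 60
```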

Shorter fragments are amplified more efficiently than longer fragments, although targets of more than 10 kb can be successfully amplified. Preferably, the primers are chosen so as to amplify an entire coding region.

In accordance with the preferred embodiments, optimal results have been obtained using primers that are 19-25 bases in length. However, one skilled in the art will recognize that the length of the primers used may vary. For example, it is envisioned that shorter primers containing at least 15, and preferably at least 17, bases may be suitable. The exact upper limit of the length of the primers is not critical. However, typically the primers will be less than or equal to approximately 50 bases, preferably less than or equal to 30 bases.
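
The length ranges stated above can be summarized in a small helper, shown here as a non-limiting Python sketch. The function name and the category labels are hypothetical; only the numeric boundaries (15, 17, 19-25, 30 and approximately 50 bases) come from the text.

```python
def classify_primer_length(n_bases: int) -> str:
    """Map a primer length onto the ranges described in the text.

    The labels are illustrative; the boundaries follow the specification.
    """
    if n_bases < 15:
        return "shorter than envisioned"
    if n_bases < 17:
        return "may be suitable (at least 15 bases)"
    if n_bases < 19:
        return "may be suitable (preferably at least 17 bases)"
    if n_bases <= 25:
        return "optimal (19-25 bases)"
    if n_bases <= 30:
        return "preferred upper range (up to 30 bases)"
    if n_bases <= 50:
        return "typical upper range (up to about 50 bases)"
    return "longer than typical"


if __name__ == "__main__":
    for length in (14, 16, 18, 22, 28, 45, 60):
        print(length, classify_primer_length(length))
```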

Primer Synthesis

Methods for synthesizing primers are available in the art. The oligonucleotide primers of this invention may be prepared using any conventional DNA synthesis method, such as phosphotriester methods such as described by Narang et al. (1979, Meth. Enzymol. 68:90) or Itakura (U.S. Pat. No. 4,356,270), and phosphodiester methods such as described by Brown et al. (1979, Meth. Enzymol. 68:109), or automated embodiments thereof, as described by Mullis et al. (U.S. Pat. No. 4,683,202). Also see particularly Sambrook et al. (1989), Molecular Cloning: A Laboratory Manual (2d ed.; Cold Spring Harbor Laboratory: Plainview, N.Y.), herein incorporated by reference in its entirety.

Useful DNA Polymerases and Reverse Transcriptases

DNA polymerases and their properties are described in detail in, among other places, DNA Replication 2nd edition, Kornberg and Baker, W. H. Freeman, New York, N.Y. (1991).

Known conventional DNA polymerases include, for example, Pyrococcus furiosus (Pfu) DNA polymerase (Lundberg et al., 1991, Gene, 108: 1, provided by Stratagene, La Jolla, Calif., USA), Pyrococcus woesei (Pwo) DNA polymerase (Hinnisdaels et al., 1996, Biotechniques, 20:186-8, provided by Boehringer Mannheim, Roche Molecular Biochemicals, Indianapolis, Ind., USA), Thermus thermophilus (Tth) DNA polymerase (Myers and Gelfand 1991, Biochemistry 30:7661), Bacillus stearothermophilus DNA polymerase (Stenesh and McGowan, 1977, Biochim Biophys Acta 475:32), Thermococcus litoralis (Tli) DNA polymerase (also referred to as Vent DNA polymerase, Cariello et al., 1991, Polynucleotides Res, 19: 4193, provided by New England Biolabs, Beverly, Mass., USA), 9°Nm DNA polymerase (discontinued product from New England Biolabs, Beverly, Mass., USA), Thermotoga maritima (Tma) DNA polymerase (Diaz and Sabino, 1998 Braz J. Med. Res, 31:1239), Thermus aquaticus (Taq) DNA polymerase (Chien et al., 1976, J. Bacteriol, 127: 1550), Pyrococcus kodakaraensis KOD DNA polymerase (Takagi et al., 1997, Appl. Environ. Microbiol. 63:4504), JDF-3 DNA polymerase (from Thermococcus sp. JDF-3, Published International patent application WO 0132887), Pyrococcus GB-D (PGB-D) DNA polymerase (also referred to as Deep-Vent DNA polymerase, Juncosa-Ginesta et al., 1994, Biotechniques, 16:820, provided by New England Biolabs, Beverly, Mass., USA), UlTma DNA polymerase (from thermophile Thermotoga maritima; Diaz and Sabino, 1998 Braz. J. Med. Res. 31:1239; provided by PE Applied Biosystems, Foster City, Calif., USA), Tgo DNA polymerase (from Thermococcus gorgonarius, provided by Roche Molecular Biochemicals, Indianapolis, Ind., USA), E. coli DNA polymerase I (Lecomte and Doubleday, 1983, Polynucleotides Res. 11:7505), T7 DNA polymerase (Nordstrom et al., 1981, J. Biol. Chem. 256:3112), and archaeal DP1/DP2 DNA polymerase II (Cann et al., 1998, Proc Natl Acad Sci USA 95:14250-5).
The polymerization activity of any of the above enzymes can be defined by means well known in the art. One unit of DNA polymerization activity of conventional DNA polymerase, according to the subject invention, is defined as the amount of enzyme which catalyzes the incorporation of 10 nmoles of total deoxynucleotides (dNTPs) into polymeric form in 30 minutes at optimal temperature (e.g., 72° C. for Pfu DNA polymerase). Assays for DNA polymerase activity and 3′-5′ exonuclease activity can be found in DNA Replication 2nd Ed., Kornberg and Baker, supra; Enzymes, Dixon and Webb, Academic Press, San Diego, Calif. (1979), as well as other publications available to the person of ordinary skill in the art.
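
As a worked illustration of the unit definition above, the following Python sketch converts a measured incorporation into units. It assumes, for illustration only, that incorporation is linear over the assay time; the function name and the example measurement are hypothetical.

```python
NMOL_PER_UNIT = 10.0      # nmoles of total dNTPs incorporated per unit
MINUTES_PER_UNIT = 30.0   # reference assay time in the unit definition


def polymerase_units(nmol_incorporated: float, assay_minutes: float) -> float:
    """Units of polymerization activity present in an assay.

    One unit incorporates 10 nmoles of total dNTPs into polymeric form in
    30 minutes at the optimal temperature. Linear incorporation over the
    assay time is an assumption made for this sketch.
    """
    if assay_minutes <= 0:
        raise ValueError("assay_minutes must be positive")
    if nmol_incorporated < 0:
        raise ValueError("nmol_incorporated cannot be negative")
    return nmol_incorporated * MINUTES_PER_UNIT / (NMOL_PER_UNIT * assay_minutes)


if __name__ == "__main__":
    # Hypothetical measurement: 25 nmoles incorporated in a 15-minute assay.
    print(polymerase_units(25.0, 15.0))
```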

When using the subject compositions in reaction mixtures that are exposed to elevated temperatures, e.g., during the PCR technique, use of thermostable DNA polymerases is preferred.

Reverse transcriptases useful according to the invention include, but are not limited to, reverse transcriptases from HIV, HTLV-1, HTLV-II, FeLV, FIV, SIV, AMV, MMTV, MoMuLV and other retroviruses (for reviews, see for example, Levin, 1997, Cell 88:5-8; Verma, 1977, Biochim. Biophys. Acta 473:1-38; Wu et al., 1975, CRC Crit. Rev. Biochem. 3:289-347).

Phagemid Cloning Vectors Useful According to the Invention.

Methods well known to those skilled in the art can be used to construct phagemid cloning vectors containing a polynucleotide of the invention. These methods include in vitro recombinant DNA techniques, synthetic techniques and in vivo recombination/genetic recombination. See, for example, the techniques described in Sambrook & Russell, Molecular Cloning: A Laboratory Manual, 3^(rd) Edition, Cold Spring Harbor Laboratory, N.Y. (2001) and Ausubel et al., Current Protocols in Molecular Biology (Greene Publishing Associates and Wiley Interscience, N.Y. (1989)).

In a preferred embodiment, the invention provides for phagemid cloning vectors that typically contain an origin of DNA replication, e.g., a colE1 origin or any of a number of plasmid origins of replication, and also an F1 origin of replication that enables phage controlled DNA replication. A number of phagemid vectors are commercially available, such as the pBluescript II phagemids (Stratagene Catalog #212205, #212206, #212207 and #212208), which have an extensive polylinker with 21 unique restriction enzyme recognition sites. Flanking the polylinker are T7 and T3 RNA polymerase promoters that can be used to synthesize RNA in vitro. pBluescript II phagemids contain a 454-bp filamentous f1 phage intergenic region (M13 related), which includes the 307-bp origin of replication. The (+) and (−) orientations of the f1 intergenic region allow the rescue of sense or antisense ssDNA by a helper phage, which promotes the packaging of the replicated single stranded phagemid DNA into infectious phage particles that are secreted into the media. Phagemids therefore permit the rescue of cloned sequences without the need for traditional subcloning methods.

Expression Vectors According to the Invention

The invention provides for vectors for the expression of variants of endonuclease VII of phage T4. Appropriate cloning and expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook et al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, N.Y. (1989), the disclosure of which is incorporated herein by reference in its entirety.

The DNA sequence in the expression vector is operatively linked to an appropriate expression control sequence(s) (promoter) to direct mRNA synthesis. Examples of such promoters include but are not limited to: LTR or SV40 promoter in mammalian cells, the E. coli lac or trp, the phage P_(L) promoter and other promoters known to control expression of genes in prokaryotic or eukaryotic cells or their viruses. The expression vector also contains a ribosome binding site for translation initiation and a transcription terminator. The vector may also include appropriate sequences for amplifying expression.

In addition, the expression vectors preferably contain a gene to provide a phenotypic trait for selection of transformed host cells such as dihydrofolate reductase or neomycin resistance for eukaryotic cell culture, or such as tetracycline or ampicillin resistance in E. coli. The vector containing the appropriate DNA sequence as hereinabove described, as well as an appropriate promoter or control sequence, may be employed to transform an appropriate host to permit the host to express the protein.

In one embodiment, the variant T4 endonuclease VII genes are cloned into a pACYC 184 expression vector that contains a multiple cloning site (MCS), the lac operator, and P15A, which is compatible with plasmids containing a colE1 origin of replication (Nakano et al. (1995) Gene 162: 157-158). In another embodiment, the mismatch recognizing enzyme or protein is cloned into a cosmid vector.

Host Cells Useful According to the Invention

The present invention further provides host cells containing the vectors of the present invention, wherein the nucleic acid has been introduced into the host cell using known transformation, transfection or infection methods. The host cell can be a eukaryotic host cell, such as a mammalian cell, a plant cell, a lower eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a bacterial cell.

Mammalian host cells include, for example, monkey COS cells, Chinese Hamster Ovary (CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3 cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK, HL-60, U937, HaK or Jurkat cells.

In yeast, a number of vectors containing constitutive or inducible promoters may be used. For a review see, Current Protocols in Molecular Biology, Vol. 2, Ed. Ausubel et al., Greene Publish. Assoc. & Wiley Interscience, Ch. 13 (1988); Grant et al. (1987) “Expression and Secretion Vectors for Yeast”, Methods Enzymol. 153:516-544; Glover, DNA Cloning, Vol. II, IRL Press, Wash., D.C., Ch. 3 (1986); Bitter, Heterologous Gene Expression in Yeast, Methods Enzymol. 152:673-684 (1987); and The Molecular Biology of the Yeast Saccharomyces, Eds. Strathern et al., Cold Spring Harbor Press, Vols. I and II (1982).

In a preferred embodiment, the host of the invention is a prokaryotic cell such as E. coli, other enterobacteriaceae such as Salmonella typhimurium, bacilli, various pseudomonads, or other prokaryotes which can be transformed, transfected, and/or infected. The present invention further provides host cells genetically engineered to contain the polynucleotides of the invention. For example, such host cells may contain nucleic acids of the invention introduced into the host cell using known transformation, transfection or infection methods. The present invention still further provides host cells genetically engineered to express the polynucleotides of the invention, wherein such polynucleotides are in operative association with a regulatory sequence heterologous to the host cell, which drives expression of the polynucleotides in the cell.

In another preferred embodiment, the host strain XL1-Blue MRF [Stratagene (Catalog #200301); Genotype: Δ(mcrA)183 Δ(mcrCB-hsdSMR-mrr)173 endA1 supE44 thi-1 recA1 gyrA96 relA1 lac [F′ proAB lacI^(q)ZΔM15 Tn10 (Tet^(r))]] is used for the propagation of pBluescript II phagemids and for transformation of recombinant phagemids.

In another preferred embodiment, the host cell conditionally expresses one or more resolvases of prokaryotic or eukaryotic origin.

In another preferred embodiment, the host cell of the invention is XL1-Blue MRF that conditionally expresses a variant of the endonuclease VII of T4 phage.

In another preferred embodiment, the host cell of the invention is XL1-Blue MRF that conditionally expresses a temperature-sensitive mutant of the endonuclease VII of T4 phage.

Generation of Conditionally Expressed Endonuclease VII of Phage T4

The T4 endonuclease VII gene product, if expressed in E. coli, is lethal to the host cells since, during the normal bacterial life cycle, replication errors become sites for endonuclease cleavage and ultimately lead to the destruction of the chromosomal integrity and cell death. To overcome the inherent toxicity, the T4 endonuclease VII gene is randomly mutagenized, cloned into an expression vector and then transformed into an appropriate host cell such as XL1-Blue MRF. Transformants are then screened for the presence of conditional lethal mutations within the T4 endonuclease VII open reading frame that permit cell growth at permissive conditions but not at non-permissive conditions.

Mutagenesis

Methods of random mutagenesis that generate one or more randomly-situated mutations are known in the art. One example of such a method is to clone the sequence of interest into a strain of E. coli that has a high spontaneous mutation rate (i.e., a “mutator strain”, e.g., E. coli strain XL1-RED). It was this method that was used to generate the mutant T4 endonuclease VII of the present invention.

Another example of a method for random mutagenesis is the so-called “error-prone PCR method”. As the name implies, the method amplifies a given sequence under conditions in which the DNA polymerase does not support high fidelity incorporation. The conditions encouraging error-prone incorporation for different DNA polymerases vary; however, one skilled in the art may determine such conditions for a given enzyme. A key variable for many DNA polymerases in the fidelity of amplification is, for example, the type and concentration of divalent metal ion in the buffer. The use of manganese ion and/or variation of the magnesium or manganese ion concentration may therefore be applied to influence the error rate of the polymerase. The resulting PCR product is cloned into the multiple cloning site downstream of the IPTG inducible lactose promoter of the expression vector pACYC 184, using standard recombinant DNA techniques that are known to one of skill in the art. The recombinant expression vector is then transformed into competent XL1-Blue MR and plated out on Luria broth plates containing chloramphenicol (50 μg/ml) according to standard procedure.

Screening for Temperature Sensitive Mutants of T4 Endonuclease VII

Chloramphenicol resistant colonies are then screened in the presence of the IPTG inducer (0.1-1 mM) for mutants of T4 endonuclease VII that are temperature sensitive. These mutants grow normally at temperatures of 37° C. and above, where the enzyme is inactive, but are incapable of growth at room temperature whether the T4 endonuclease expression was induced or not. The nucleotide sequence of the temperature sensitive mutant of T4 endonuclease VII is provided in SEQ ID NO:1. The wild type version is shown in SEQ ID NO:2. Host cells carrying the temperature sensitive T4 endonuclease VII gene can then be propagated at the permissive temperature of 37° C.

Generation of Competent Cells

Prior to making these cells competent for transformation (chemically competent or electrocompetent), expression of the temperature sensitive T4 endonuclease VII enzyme is turned on by addition of IPTG inducer (1 mM), and the inactive enzyme is reactivated by its presumed correct folding at temperatures below 37° C. (see Example 2, below).

Methods of generating competent cells are well known to those skilled in the art. Typical methods of generating competent cells comprise growing cells to log phase or early stationary phase and exposing the cells to CaCl₂ or other cationic compound (see, e.g., Sambrook et al., In Molecular Cloning: a Laboratory Manual, 2nd Edition, eds. Sambrook et al., Cold Spring Harbor Laboratory Press, (1989)). Cells can be contacted immediately with heterologous DNA or frozen in glycerol or DMSO for subsequent use. Upon thawing to 4° C. and contacting with plasmid DNA, frozen competent cells typically have transformation efficiencies of 1×10⁵-1×10⁹ transformants/μg of plasmid DNA.
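
For illustration, a transformation efficiency in transformants/μg can be estimated from a plate count as in the following Python sketch. The function name, parameters and example numbers are hypothetical and do not describe a measured result; the calculation simply divides the colony count by the amount of plasmid DNA represented on the plate.

```python
def transformation_efficiency(colonies: int,
                              dna_ug: float,
                              fraction_plated: float) -> float:
    """Transformants per microgram of plasmid DNA.

    colonies        - colonies counted on the plate
    dna_ug          - total plasmid DNA added to the competent cells (ug)
    fraction_plated - fraction of the transformation mix spread on the plate
    """
    if dna_ug <= 0:
        raise ValueError("dna_ug must be positive")
    if not 0 < fraction_plated <= 1:
        raise ValueError("fraction_plated must be in (0, 1]")
    dna_on_plate_ug = dna_ug * fraction_plated
    return colonies / dna_on_plate_ug


if __name__ == "__main__":
    # Hypothetical example: 200 colonies from one tenth of a
    # transformation that received 100 pg (0.0001 ug) of plasmid.
    eff = transformation_efficiency(200, 0.0001, 0.1)
    print(f"{eff:.1e} transformants/ug")
```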

Generation of Electrocompetent Cells

Electroporation has also been used to transform cells (see, e.g., Dower et al., Nucleic Acids Research 16:6127-6145 (1988); Taketo, Biochimica et Biophysica Acta 949:318-324 (1988); Chassy and Flickinger, FEMS Microbiology Letters 44:173-177 (1987); and Harlander, Streptococcal Genetics, eds. Ferretti and Curtiss, American Society of Microbiology, Washington, D.C., pp. 229-233 (1987)). Electroporation methods rely on creating temporary holes in cell membranes by exposing cells to a high voltage electric impulse to facilitate the uptake of heterologous nucleic acids (see, e.g., Andreason and Evans, Biotechniques 6:650-660 (1988)). Cells exposed to an electroporation buffer (e.g., 10-15% glycerol) are generally stored by freezing to provide a supply of electrocompetent cells (see, e.g., U.S. Pat. No. 6,004,804).

DNA Denaturation and Renaturation According to the Invention

The formation of a duplex is accomplished by denaturing and then annealing two homologous and complementary nucleic acid strands in a hybridization reaction. The hybridization reaction can be made to be highly specific by adjustment of the hybridization conditions (often referred to as hybridization stringency) under which the hybridization reaction takes place, such that hybridization between two nucleic acid strands will not form a stable duplex, e.g., a duplex that retains a region of double-strandedness under normal stringency conditions, unless the two nucleic acid strands contain a certain number of nucleotides in specific sequences which are substantially, or completely, complementary.

Preferably, single-stranded DNA is prepared by denaturing double-stranded test DNA in distilled water with heat (i.e., between 90° C. and 100° C.). Those skilled in the art will appreciate that DNA denaturation can also be accomplished by heat denaturation, followed by renaturation at progressively lower temperatures. Heteroduplex formation between the different strands of a heat denatured test DNA is performed in 50 μl (total volume) containing 1× annealing buffer as previously described (Cotton, Methods in Molecular Biology 9:39 (1991)), except that the annealing temperature is set at 65° C. for 1 hour followed by 20 minutes at room temperature.

Phage Rescue According to the Invention

M13KO7 (Stratagene (La Jolla, Calif., USA) Catalog No.:#NO315S) is an M13 derived helper phage that carries the mutation Met40Ile in gII, the origin of replication from P15A and the kanamycin resistance gene from Tn903, both inserted within the M13 origin of replication. M13KO7 is able to replicate in the absence of phagemid DNA. In the presence of a phagemid bearing a wild-type M13 or f1 origin such as pBluescript II, single-stranded phagemid is packaged preferentially and secreted into the culture medium. Since these filamentous helper phages (M13, f1) will not infect E. coli without an F episome coding for pili, it is essential to use XL1-Blue MRF or a similar strain containing the F episome. Typically, 30-50 pBluescript II molecules are packaged per helper phage DNA molecule. pBluescript II phagemids are offered with the IG region in either of two orientations: pBluescript II (+) is replicated such that the sense strand of the β-galactosidase gene is secreted within the phage particles; pBluescript II (−) is replicated such that the antisense strand of the β-galactosidase gene is secreted in the phage particles. Yields of single-stranded (ss)DNA depend on the specific insert sequence. For most inserts, over 1 μg of ssDNA can be obtained from a 1.5 ml miniculture.

Generation of Competent Cells Harboring a Temperature Sensitive T4 Endonuclease VII

In vitro experiments demonstrate that T4 endonuclease VII not only digests heteroduplex DNA at the sites of mismatch but also homoduplex DNA at random (but reproducible) sites. These preliminary experiments thus established that, unless other discriminatory factors are discovered, T4 endonuclease VII can only be used in an in vivo setting, where the accuracy of the enzyme ensures that homoduplex DNA is never cleaved. However, the T4 endonuclease gene product, if expressed in E. coli, is lethal to the host cells since, during the normal bacterial life cycle, replication errors (which are normally repaired post-replicationally) become sites for endonuclease cleavage and lead to the destruction of the chromosomal integrity and eventual cell death. To overcome this problem, T4 endonuclease VII was mutagenized (see above), cloned into a pACYC expression vector downstream of the lactose promoter and transformed into an XL1-Blue MRF strain containing a pCALnEK plasmid that expresses the lactose high affinity repressor (lacI^(q)) and a second expression plasmid (R6K) that enhances the repression in this setting. Transformants were then screened for the presence of temperature sensitive mutations, and a conditional lethal mutant was isolated that grows normally at temperatures above 37° C. (enzyme inactive) but is incapable of growth at room temperature whether the T4 endonuclease expression was induced or not.

Chemical methods of making competent cells, as well as electrocompetent cells, are well known in the art (see U.S. Pat. Nos. 5,512,468; 6,338,965 and 6,040,184 and references cited therein). Prior to DNA transformation, expression of the temperature sensitive T4 endonuclease VII can be turned on and the inactive enzyme reactivated by its presumed correct folding at temperatures below 37° C. During this relatively brief period, the E. coli host cells have not yet died (and thus can be transformed with plasmid DNA) but will contain active endonuclease to cleave incoming DNA containing a mismatch.

PCR Amplification and Ligation

Methods for PCR cloning of a target DNA sequence are well known to those skilled in the art. A number of kits are now commercially available that provide all reagents and reaction conditions for the generation of PCR DNA fragments and subsequent cloning into a phagemid cloning vector. In one embodiment of the present invention, PCR fragments are generated using a TOPO cloning kit (InVitrogen, San Diego, Calif., USA), PCR-Script® cloning kit (Stratagene, La Jolla, Calif., USA; catalog #211188), or other method or kit. PCR products are incubated with one of the predigested PCR-Script phagemid cloning vectors, Srf I and T4 DNA ligase. Using the restriction enzyme in the ligation reaction maintains a high steady-state concentration of digested vector DNA and allows the use of nonphosphorylated, unmodified PCR primers. The ligation efficiency of blunt-ended DNA fragments is increased by the simultaneous, opposite reactions of the Srf I restriction enzyme and T4 DNA ligase on nonrecombinant vector DNA. The PCR-Script cloning kits allow rapid and efficient cloning of all PCR products, regardless of the PCR enzyme used to generate the inserts. PCR-Script kits also include the StrataPrep® PCR purification kit for easy and efficient removal of primers and nonspecific amplification products smaller than 100 bp.

The PCR product is diluted with distilled water and heat denatured and renatured before it is ligated to the vector. Induced competent host cells containing activated temperature sensitive T4 endonuclease VII encoded by SEQ ID NO:1 are then added to the heat denatured, renatured PCR ligation mix. Once the DNA to be transformed has entered the cells that express T4 endonuclease VII, those DNA molecules that contain mutations are cleaved by the active T4 endonuclease VII, whereas those DNA molecules that are mutation free are not digested by endonuclease VII. Since these host cells will eventually die due to the lethality of T4 endonuclease VII, the non-mutant DNAs are rescued. In a first preferred embodiment, the host cells are infected with an f1 helper phage that encodes the enzymes necessary to replicate plasmids containing an f1 origin of replication such as pUC and pBluescript and the like. This replication is followed by packaging of a single stranded molecule that can then infect a secondary host strain that harbors the F′ episome. Incoming DNA molecules that harbor mutations and are cleaved by T4 endonuclease VII are unable to be packaged. In another embodiment, non-mutant plasmids are rescued by conjugal mating. In this scenario, the incoming plasmid DNA molecule should contain an origin of conjugal transfer (oriT; Ditta, G. et al., Proc. Natl. Acad. Sci. USA 77:7347-51, 1980; Hengen, P. N. and V. N. Iyer, Biotechniques 13:56-62 (1992)) and a second expression plasmid encoding the conjugal transfer proteins.

Conditional Activity

In one embodiment, the activity of the enzyme is inhibited by heat (e.g., by exposure to temperatures of 37-39° C.). In such an embodiment, the enzyme has a mutation that causes the protein to fold incorrectly at the warmer temperatures, resulting in an inhibition of the enzyme's activity. Upon exposure to cooler temperatures, the protein re-folds correctly, and is active.

Enzymes that bind and/or cleave mismatched nucleic acids, and that exhibit conditional activity/inactivity, can be used in the present invention. For instance, an enzyme which is active at warmer temperatures, but is inactive at colder temperatures, can be used.

Other inhibition and repression systems can also be used. For instance, enzymes the activity of which is dependent on particular ionic conditions or a particular pH can be used in the invention.

In another embodiment of the invention, the activity of the enzyme is dependent on the availability of an inducer, e.g., a sugar, an amino acid, or other metabolite or compound.

The enzyme can also be controlled by a regulatory nucleic acid. For instance, one can use a promoter that is activated by cold, by heat, ionic concentrations, pH levels, UV irradiation, EMS (ethylmethanesulfonate) damage, or other conditions. For instance, in the λC1 repressor and λ promoter system, the promoter is only active under elevated temperatures.

Activity Levels

If the enzyme used in the invention is highly active, or exhibits some non-specific cleavage (i.e., cleavage at homoduplex, as well as heteroduplex sites), then it is desirable that the control of the enzyme's activity be complete, that is, that repression is complete when the enzyme is exposed to inhibitory conditions. If the enzyme is not highly active, or if the enzyme is very specific in its activity, then one can use an enzyme that is not completely inhibited under the inhibitory conditions.

For instance, T4 endonuclease VII is very highly active, and even a small amount of active enzyme (i.e., as would be found under a “leaky” regulatory system) is lethal to the host E. coli cell. On the other hand, T7 endonuclease I exhibits a lower activity, and is not lethal to the host cell when repression of the enzyme is not complete.

Assays

To assay the effectiveness of the methods described herein, and to verify that one has selectively cloned homoduplex nucleic acid molecules, certain control assays can be used.

In one assay, two different restriction fragments are used as controls to form hetero- and homoduplexes. One is a double-stranded restriction fragment containing a nucleic acid sequence (SEQ ID NO:5; FIG. 5) encoding a Green Fluorescent Protein (GFP) (SEQ ID NO:6; FIG. 5), and the other is a restriction fragment that contains a nucleic acid sequence (SEQ ID NO:7; FIG. 6) encoding a mutant Green Fluorescent Protein (G^(W)FP) (SEQ ID NO:8; FIG. 6), which contains a premature stop codon. The normal GFP, when cloned into and expressed constitutively in a host cell, results in the cell fluorescing green under ultraviolet light. A host cell expressing the mutant GFP protein does not fluoresce, i.e., it is white, rather than green.

The two restriction fragments are mixed together, e.g., in a 9:1 ratio (nine parts of GFP-encoding restriction fragment to one part mutant GFP-encoding restriction fragment). The mixture is denatured, renatured, ligated into a vector, and transformed into a host cell of the invention, e.g., a host cell expressing a conditionally-active T4 endonuclease VII. As an additional control, the mixture can also be transformed into a wild type cell, e.g., a host cell not expressing a conditionally-active T4 endonuclease VII. The host cells are then placed under conditions to allow the conditionally-expressed T4 endonuclease VII to act on any heteroduplexes that might be present.

The wild type cells not expressing the enzyme will express the normal GFP and the mutant GFP, and will be present at a ratio of nine green-fluorescing cells to one non-fluorescent cell. Among the cells expressing the conditionally active T4 endonuclease VII, heteroduplexes are cleaved, so cells containing homoduplexes of the normal GFP nucleic acid sequence will outnumber non-fluorescent cells by the square of the input ratio, in this case 81:1, or 9²:1.
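One way to see where the 81:1 figure comes from is to assume that the strands reanneal at random after denaturation and that the enzyme eliminates every heteroduplex. Neither assumption is spelled out in the text above, and the function name `duplex_fractions` is purely illustrative. Under those assumptions, a minimal sketch of the expected duplex pool is:

```python
from fractions import Fraction


def duplex_fractions(normal_parts, mutant_parts):
    """Expected duplex pool after denaturing and renaturing a mixture.

    Assumption (not from the source): strands pair at random, so the
    pool follows p^2 : 2pq : q^2.
    """
    total = normal_parts + mutant_parts
    p = Fraction(normal_parts, total)  # chance a strand is normal GFP
    q = Fraction(mutant_parts, total)  # chance a strand is mutant GFP
    return {
        "normal_homoduplex": p * p,
        "heteroduplex": 2 * p * q,
        "mutant_homoduplex": q * q,
    }


pool = duplex_fractions(9, 1)
# If every heteroduplex is cleaved, only the two homoduplex classes remain.
surviving_ratio = pool["normal_homoduplex"] / pool["mutant_homoduplex"]
print(pool)
print(surviving_ratio)
```

For a 9:1 input this model gives 81% normal homoduplex, 18% heteroduplex, and 1% mutant homoduplex, so removing the heteroduplexes leaves normal and mutant clones at 81:1.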

For a given starting ratio of X:1 of the two fragments, therefore, host cells not expressing the T4 endonuclease VII will exhibit a ratio of X:1, while host cells expressing the T4 endonuclease VII will exhibit a ratio of X²:1. It is therefore preferable to use the normal GFP at an excess relative to the mutant GFP, as it will be much simpler to determine the difference between ratios of fluorescing cells of 9:1 vs. 81:1 (from a starting ratio of 9:1 normal vs. mutant GFP), than it would be to determine the difference between ratios of 2:1 and 4:1 (from a starting ratio of 2:1 normal vs. mutant GFP).
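To illustrate why an excess of normal GFP makes the two outcomes easier to tell apart, the sketch below converts the X:1 and X²:1 ratios into the expected fraction of non-fluorescent (white) colonies on a plate. It assumes every scored colony is either green or white; the helper names `white_fraction` and `compare` are hypothetical.

```python
from fractions import Fraction


def white_fraction(green_to_white):
    """Fraction of white colonies for a green:white ratio of N:1."""
    return Fraction(1, green_to_white + 1)


def compare(x):
    """White-colony fractions without (X:1) and with (X^2:1) the enzyme."""
    without_enzyme = white_fraction(x)
    with_enzyme = white_fraction(x * x)
    fold_drop = without_enzyme / with_enzyme
    return without_enzyme, with_enzyme, fold_drop


for x in (2, 9):
    without_enzyme, with_enzyme, fold_drop = compare(x)
    print(
        f"X={x}: white fraction {float(without_enzyme):.3f} without enzyme, "
        f"{float(with_enzyme):.3f} with enzyme, "
        f"{float(fold_drop):.2f}-fold drop"
    )
```

At a 2:1 input the white fraction falls from 1/3 to 1/5, about a 1.7-fold change. At 9:1 it falls from 1/10 to 1/82, about an 8.2-fold change, which is far easier to detect by counting colonies.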

Other assays, using other fluorescent or otherwise detectable expression products, can also be used.

Kits

The invention is intended to provide novel compositions and methods for the cloning of PCR amplification products devoid of PCR-induced mutations, as described herein. The invention herein also contemplates a kit format that comprises a package unit having one or more containers of the subject composition and, in some embodiments, containers of various reagents used for polynucleotide synthesis, including synthesis in PCR. The kit may also contain one or more of the following items: polymerization enzymes, dNTPs, primers, buffers, antibiotics, helper phage, instructions, and controls. The kits may include containers of reagents mixed together in suitable proportions for performing the methods in accordance with the invention. Reagent containers preferably contain reagents in unit quantities that obviate measuring steps when performing the subject methods. In one embodiment, the kit contains XL1-Blue MRF cells containing a pCALnEK plasmid that expresses the lactose high affinity repressor (lacI^(q)), a second expression plasmid (R6K) that enhances the repression, and a pACYC184 expression plasmid containing SEQ ID NO:1, which encodes the temperature sensitive T4 endonuclease VII.

EXAMPLES

Example 1 Preparation of Plasmid Ligation Mixture

The amplified PCR product of interest is separated from the PCR amplification enzyme by any conventional method (e.g., resin or spin cup to remove enzyme).

The DNA is then heated to 95° C. for 5 minutes, then allowed to slowly cool at room temperature (or in a thermal heating block whose temperature can be ramped down) until the mixture reaches room temperature. This material is then ligated or annealed to the plasmid of interest. This DNA can also be transformed directly into a host cell, depending on the application and the ultimate goal of the experiment. For instance, the transforming DNA can be recombined with a plasmid that is already present in the host, the DNA may recombine with the chromosomal DNA, the DNA may replicate as a linear molecule, etc.

Example 2 Preparation of Host Cells

The E. coli host cells harboring the T4 Endonuclease VII (and the lacI^(q) encoding repressor plasmid, the lacI^(q) enhancer plasmid and the F′ episome) are grown at 37° C. or above until the cells reach an optical density of 0.2 (at 550 nm). The inducer molecule (IPTG) is added and the cells are maintained at 42° C. for 30 minutes. The cells are rapidly brought to room temperature and are maintained there for a period of 0-3 hours. These cells are then made competent by standard methods (chemically competent or electrocompetent) and frozen at −80° C. prior to use.

These “competent” cells are then thawed, an appropriate volume is withdrawn, and the plasmid ligation mixture is added. The transformation protocol is standard. After heat pulse or electric shock, f1 helper phage (containing the enzymes necessary to replicate and package plasmid DNA molecules containing the f1 phage origin of replication) is added at the same time as the terminal acceptor strain (any E. coli containing the F′ episome required for f1 infection). The f1 helper phage will package any plasmid DNA molecule containing the f1 origin of replication; those that have been cleaved by T4 endonuclease VII will not be contiguous and cannot be packaged. The packaged phagemid is then readily transferred to the terminal host.

All patents, patent applications, and published references cited herein are hereby incorporated by reference in their entirety. While this invention has been particularly shown and described with references to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.

1. A host cell for selectively cloning homoduplex nucleic acid molecules, wherein the host cell is: (a) competent; (b) contains a gene encoding a resolvase that is conditionally expressed and/or conditionally active; and (c) can be maintained under conditions under which the resolvase is not expressed or repressed and propagated in inactive form.
2. The host cell of claim 1, wherein said host cell is eukaryotic.
3. The host cell of claim 1, wherein said host cell is prokaryotic.
4. The host cell of claim 1, wherein said conditionally expressed resolvase is selected from the group consisting of: a bacteriophage resolvase, a prokaryotic resolvase, and a eukaryotic resolvase.
5. The host cell of claim 1, wherein said conditionally expressed resolvase is bacteriophage T4 Endonuclease VII.
6. The host cell of claim 1, wherein said conditionally expressed resolvase is a temperature sensitive mutant of bacteriophage T4 Endonuclease VII.
7. The host cell of claim 1, wherein said conditionally expressed resolvase is a temperature sensitive mutant of bacteriophage Endonuclease VII encoded by SEQ ID NO:1.
8-29. (canceled)
30. A kit comprising a competent host cell containing a heterologous gene encoding a conditionally expressed resolvase, and packaging materials therefor.
31. A kit comprising a competent host cell containing a heterologous gene encoding a conditionally expressed resolvase, wherein the host cell can be maintained under conditions under which the resolvase is expressed, and packaging materials therefor.
32. A kit comprising a host cell containing an expression vector encoding a temperature sensitive mutant of bacteriophage T4 Endonuclease VII of SEQ ID NO:2, PCR reagents, helper phage and packaging materials therefor.
33-38. (canceled)